Set-up

Run this chunk first.

Day 1

  • se = \(\sigma\)/sqrt(n)
  • Type M (for magnitude) error happens when power is low
H0 True H0 False
Accept All good Type II
1 - \(\beta\)
Reject Type I
\(\alpha\) = .05
Power
\(\beta\)

Lecture 1.1 - Discrete random variables

In addition to r/d/pbinom, we can use r/d/pbern (for Bernoulli, but since Bernoulli = binomial these two families of functions are identical).

extraDistr::rbern(n = 10, prob = .5)
##  [1] 0 0 0 0 0 0 0 1 0 1
# probability for possible outcomes (0 or 1, 'cause binomial)
extraDistr::dbern(0, prob = .7)
## [1] 0.3
extraDistr::dbern(1, prob = .7) 
## [1] 0.7
# probability of getting a success (=1) or failure (=0) given the probability of .7 (e.g., if probability of tossing a coin and gatting a tails is .7 for some reason, then for a single coin toss what is the probability of getting a 1 or 0, given prob = .7?)

And cumulative probability distribution function with the bern family of functions:

extraDistr::pbern(1.,p=.5)
## [1] 1

Lecture 1.2: Discrete random variables (the binomial)

Lecture 1.3: Continuous random variables

p/d/rnorm family of functions for continuous:

# probability density function (PDF)
dnorm( # probability of observing
  400, # 400ms
  mean = 500, # when the true mean is 500
  sd = 100) # and the sd is 100
## [1] 0.002419707
# cumulative density function (CDF)
pnorm( # probability of observing
  400, # 400ms *or lower*
  mean = 500, # when the true mean is 500
  sd = 100) # and the sd is 100
## [1] 0.1586553
# inverse cumulative density function (iCDF)
qnorm( # k (quantile) with a CDF of
  0.543, # 0.543
  mean = 500, # when the true mean is 500
  sd = 100) # and the sd is 100
## [1] 510.7995

(at this point my notes are no longer linked to a particular lecture)

Once you’ve generated a prior, these functions come in handy.

  • common misunderstanding of the maximum likelihood estimate (MLE): it doens’t represent the true value of \(\theta\), because it’s the MLE (best guess) for the data you have
    • but the MLE will be closer to the true value of \(\theta\) as sample size increases

Bivariate and multivariate distributions (discrete)

  • joint probability mass function (joint PMF): for when we are considering pairs of observations

Bivariate and multivariate distributions (continuous)

  • consider 2 random variables, X and Y, each with a normal distribution, e.g., each have the distribution Normal(0,1), each with 1000 observations. In this case, they’re completely uncorrelated:
x <- rnorm(1000,0,1)
y <- rnorm(1000,0,1)

plot(x,y)

cor(x,y)
## [1] 0.01076854

If we want to describe a correlation between these two, we can find their joint distribution (bivariate distribution)

  • covariance of two random variables = correation x sd of each variable: \(\rho\)*\(\sigma\)*\(\sigma\)
# calculate covariance of observed x and y:
cor(x,y)*sd(x)*sd(y)
## [1] 0.0108706
# or simply use the cov() function
cov(x,y)
## [1] 0.0108706

Generate simulated bivariate data

## define a variance-covariance matrix:
Sigma <- matrix(c(5^2, 5 * 10 * .6, 5 * 10 * .6, 10^2),
byrow = FALSE, ncol = 2
)
## generate data:
u <- MASS::mvrnorm(n = 100, # 100 data points
                   mu = c(0, 0), # means for 2 variables
                   Sigma = Sigma) # make sigma the matrix we just defined
# MASS::mvrnorm = "multivariate rnorm"
plot(u[,1],u[,2])

head(u, n = 3)
##            [,1]     [,2]
## [1,]  5.5700090 11.15799
## [2,] -0.2836913 -2.09299
## [3,] 14.2082902 28.39976
# and if we have a NEGATIVE correlation?
## define a variance-covariance matrix:
SigmaM <- matrix(c(5^2, 5 * 10 * -.6, 5 * 10 * -.6, 10^2),
byrow = FALSE, ncol = 2
)
## generate data:
uM <- MASS::mvrnorm(n = 100, # 100 data points
                   mu = c(0, 0), # means for 2 variables
                   Sigma = SigmaM) # make sigma the matrix we just defined
# MASS::mvrnorm = "multivariate rnorm"
plot(uM[,1],uM[,2])

head(uM, n = 3)
##           [,1]      [,2]
## [1,] -1.775278 18.305100
## [2,] -1.517561 -8.477288
## [3,]  2.188085 -3.276041
# and if we have correlation of 0?
## define a variance-covariance matrix:
Sigma0 <- matrix(c(5^2, 5 * 10 * 0, 5 * 10 * 0, 10^2),
byrow = FALSE, ncol = 2
)
## generate data:
u0 <- MASS::mvrnorm(n = 100, # 100 data points
                   mu = c(0, 0), # means for 2 variables
                   Sigma = Sigma0) # make sigma the matrix we just defined
# MASS::mvrnorm = "multivariate rnorm"
plot(u0[,1],u0[,2])

head(u0, n = 3)
##           [,1]      [,2]
## [1,] -5.809083 10.043724
## [2,]  2.987491  1.207535
## [3,]  2.673755  9.657816
library(bcogsci)
data("df_gibsonwu")
head(df_gibsonwu)
##     subj item     type   rt
## 94     1   13  obj-ext 1561
## 221    1    6 subj-ext  959
## 341    1    5  obj-ext  582
## 461    1    9  obj-ext  294
## 621    1   14 subj-ext  438
## 753    1    4 subj-ext  286
df_gibsonwu$cond <- ifelse(df_gibsonwu$type=="obj-ext",-.5,+.5)

m <- lme4::lmer(rt~cond + (1+cond|subj) + (1+cond|item), data = df_gibsonwu)
summary(m)
## Linear mixed model fit by REML ['lmerMod']
## Formula: rt ~ cond + (1 + cond | subj) + (1 + cond | item)
##    Data: df_gibsonwu
## 
## REML criterion at convergence: 8480.1
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.8275 -0.4036 -0.1886  0.0575  8.4268 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  subj     (Intercept)  25725   160.4        
##           cond         37966   194.8    1.00
##  item     (Intercept)  23834   154.4        
##           cond         20139   141.9    1.00
##  Residual             295557   543.7        
## Number of obs: 547, groups:  subj, 37; item, 15
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   547.33      53.21  10.287
## cond          119.70      67.48   1.774
## 
## Correlation of Fixed Effects:
##      (Intr)
## cond 0.647 
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

From the output:

# Random effects:
#  Groups   Name        Variance Std.Dev. Corr
#  subj     (Intercept)  25725   160.4        
#           cond         37966   194.8    1.00

The Std. of the participants’ means is less variable (16) than in the condition coded +0.5 (194). That’s weird and is probably misestimated because the model is too complex (overparameterised for the given data). Let’s try to fit a simpler model by removing the correlation parameter, which assumes the covariances are zero:

m <- lme4::lmer(rt~cond + 
                  (1+cond||subj) + 
                  (1+cond||item), 
                data = df_gibsonwu)
summary(m)
## Linear mixed model fit by REML ['lmerMod']
## Formula: rt ~ cond + ((1 | subj) + (0 + cond | subj)) + ((1 | item) +  
##     (0 + cond | item))
##    Data: df_gibsonwu
## 
## REML criterion at convergence: 8498.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -1.3384 -0.3850 -0.2069  0.0194  8.8632 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subj     (Intercept)  21936   148.11  
##  subj.1   cond          1196    34.58  
##  item     (Intercept)  22587   150.29  
##  item.1   cond         12390   111.31  
##  Residual             310721   557.42  
## Number of obs: 547, groups:  subj, 37; item, 15
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   548.45      51.67  10.614
## cond          120.57      56.03   2.152
## 
## Correlation of Fixed Effects:
##      (Intr)
## cond 0.003

This singular fit error disapperas and the variance Std.Dev. makes more sense.

Ch. 1 Exercises

Exercise 1.1 Practice using the pnorm function - Part 1

Given a normal distribution with mean 500 and standard deviation 100, use the pnorm function to calculate the probability of obtaining values between 200 and 800 from this distribution.

pnorm(800, 500, 100) - pnorm(200, 500, 100)
## [1] 0.9973002

Exercise 1.2 Practice using the pnorm function - Part 2

Calculate the following probabilities. Given a normal distribution with mean 800 and standard deviation 150, what is the probability of obtaining:

a score of 700 or less a score of 900 or more a score of 800 or more

pnorm(700, 800, 150)
## [1] 0.2524925
pnorm(900, 800, 150, lower.tail=F)
## [1] 0.2524925
pnorm(800, 800, 150, lower.tail=F)
## [1] 0.5

Exercise 1.3 Practice using the pnorm function - Part 3

Given a normal distribution with mean 600 and standard deviation 200, what is the probability of obtaining:

a score of 550 or less. a score between 300 and 800. a score of 900 or more.

pnorm(550, 600, 200)
## [1] 0.4012937
pnorm(800, 600, 200) - pnorm(300, 600, 200)
## [1] 0.7745375
pnorm(900, 600, 200, lower.tail = F)
## [1] 0.0668072

Exercise 1.4 Practice using the qnorm function - Part 1

Consider a normal distribution with mean 1 and standard deviation 1. Compute the lower and upper boundaries such that:

the area (the probability) to the left of the lower boundary is 0.10. the area (the probability) to the left of the upper boundary is 0.90.

qnorm(c(.9,.1), 1,1)
## [1]  2.2815516 -0.2815516

Exercise 1.5 Practice using the qnorm function - Part 2

Given a normal distribution with mean 650 and standard deviation 125. There exist two quantiles, the lower quantile q1 and the upper quantile q2, that are equidistant from the mean 650, such that the area under the curve of the normal between q1 and q2 is 80%. Find q1 and q2.

qnorm(c(.1,.9),650,125)
## [1] 489.8061 810.1939

Exercise 1.6 Practice getting summaries from samples - Part 1

Given data that is generated as follows:

data_gen1 <- rnorm(1000, 300, 200)

Calculate the mean, variance, and the lower quantile q1 and the upper quantile q2, that are equidistant and such that the range of probability between them is 80%.

mean(data_gen1)
## [1] 304.2269
sd(data_gen1)
## [1] 193.2143
qnorm(c(.1,.9),mean(data_gen1), sd(data_gen1))
## [1]  56.61283 551.84099
hist(data_gen1)

Exercise 1.7 Practice getting summaries from samples - Part 2.

This time we generate the data with a truncated normal distribution from the package extraDistr. The details of this distribution will be discussed later in section 4.1 and in the Box 4.1, but for now we can treat it as an unknown generative process:

data_gen2 <- extraDistr::rtnorm(1000, 300, 200, a = 0)

Calculate the mean, variance, and the lower quantile q1 and the upper quantile q2, that are equidistant and such that the range of probability between them is 80%.

mean(data_gen2)
## [1] 327.1436
sd(data_gen2)
## [1] 177.4894
qnorm(c(.1,.9), mean(data_gen2), sd(data_gen2))
## [1]  99.68182 554.60534
hist(data_gen2)

Exercise 1.8 Practice with a variance-covariance matrix for a bivariate distribution.

Suppose that you have a bivariate distribution where one of the two random variables comes from a normal distribution with mean \(\mu\) = 600 and standard deviation \(\sigma\) = 100, and the other from a normal distribution with mean \(\mu\) = 400 and standard deviation \(\sigma\) = 50. The correlation \(\rho\) between the two random variables is 0.4. Write down the variance-covariance matrix of this bivariate distribution as a matrix (with numerical values, not mathematical symbols), and then use it to generate 100 pairs of simulated data points. Plot the simulated data such that the relationship between the random variables X and Y is clear.

## define a variance-covariance matrix:
Sigma4 <- matrix(c(100^2, 100 * 50 * .4, # square sdX = 5^2, mean X * mean Y * their corr
                  100 * 50 * .4, 50^2), # mean X * mean Y * their corr, square sdY = 10^2, 
byrow = FALSE, ncol = 2
)
## generate data:
corr4 <- MASS::mvrnorm(n = 100, # 100 data points
                   mu = c(600, 400), # means for 2 variables
                   Sigma = Sigma4) # make sigma the matrix we just defined
# MASS::mvrnorm = "multivariate rnorm"
plot(corr4[,1],corr4[,2]); corr4_plot <- recordPlot()

Generate two sets of new data (100 pairs of data points each) with correlation - 0.4 and 0, and plot these alongside the plot for the data with correlation 0.4.

## define a variance-covariance matrix:
SigmaMinus4 <- matrix(c(100^2, 100 * 50 * -.4, # square sdX = 5^2, mean X * mean Y * their corr
                  100 * 50 * -.4, 50^2), # mean X * mean Y * their corr, square sdY = 10^2, 
byrow = FALSE, ncol = 2
)
## generate data:
corrMinus4 <- MASS::mvrnorm(n = 100, # 100 data points
                   mu = c(600, 400), # means for 2 variables
                   Sigma = SigmaMinus4) # make sigma the matrix we just defined
# MASS::mvrnorm = "multivariate rnorm"
plot(corrMinus4[,1],corrMinus4[,2]); corrMinus4_plot <- recordPlot()

## define a variance-covariance matrix:
Sigma0 <- matrix(c(100^2, 100 * 50 * 0, # square sdX = 5^2, mean X * mean Y * their corr
                  100 * 50 * 0, 50^2), # mean X * mean Y * their corr, square sdY = 10^2, 
byrow = FALSE, ncol = 2
)
## generate data:
corr0 <- MASS::mvrnorm(n = 100, # 100 data points
                   mu = c(600, 400), # means for 2 variables
                   Sigma = Sigma0) # make sigma the matrix we just defined
# MASS::mvrnorm = "multivariate rnorm"
plot(corr0[,1],corr0[,2]); corr0_plot <- recordPlot()

library(gridGraphics)
ggpubr::ggarrange(corr4_plot,corrMinus4_plot,corr0_plot,
                  nrow=1)

Day 2

Yesterday we look at the examples of discrete variables and continuous variables. In Bayesian, we’re interested in how uncertain we are about the possible true values are: uncertainty quantification (uncertainty of \(\mu\) and \(\sigma\))

  • probability density function (PMF) on the parameter is always the output of Bayesian analysis

  • uniform distribution: rectangle; certainty is constant cause we are equally unsure

Lecture 1.2: Bayes’ Rule

Event A: the streets are wet Event B: it is raining

Q: What’s the probability of the streets being wet (B) given that it is raining? P(A|B)

P(A|B) = P(A,B)/P(B), given P(B)>0 P(A,B) = P(B,A)

  • but in research we don’t work with discrete events

  • instead of P(A|B), we look at f(y|\(\theta\)), which represents probabiity density function for a given distribution

  • given the data y, we assume some density distribution that generated the data; basically y is our priors, \(\theta\) represents posterior distribution of parameters from given data

  • f(y) (‘f of y’) =

# density of a data point from a particular normal distribution
dnorm(0.1,0,1) 
## [1] 0.3969525
# what is the density on the curve at data point 0.1, given a mean of 0 and sd of 1?
  • \(\mu\) has some prior, as does \(\sigma\) which has a uniform distribution
# produce random datapoint of sigma given a uniform distribution
runif(1, min=0,max=1)
## [1] 0.8617403
k = 46  # 46 succesess
n = 100 # 100 trials
theta <- seq(0,1,by=.01)
options(scipen = 0); dbinom(k,n,theta); options(scipen=999)
##   [1] 0.000000e+00 4.269888e-64 1.736616e-50 1.257093e-42 4.013489e-37
##   [6] 6.543511e-33 1.621726e-29 1.093214e-26 2.836599e-24 3.544105e-22
##  [11] 2.484342e-20 1.089532e-18 3.239795e-17 6.943042e-16 1.124403e-14
##  [16] 1.428683e-13 1.467968e-12 1.250211e-11 9.007395e-11 5.584350e-10
##  [21] 3.022422e-09 1.445661e-08 6.175439e-08 2.377363e-07 8.313177e-07
##  [26] 2.658671e-06 7.823531e-06 2.129529e-05 5.386916e-05 1.271671e-04
##  [31] 2.811822e-04 5.842581e-04 1.144185e-03 2.117369e-03 3.711237e-03
##  [36] 6.174009e-03 9.766739e-03 1.471584e-02 2.115011e-02 2.903347e-02
##  [41] 3.811036e-02 4.788307e-02 5.763615e-02 6.651309e-02 7.363639e-02
##  [46] 7.824864e-02 7.984344e-02 7.825541e-02 7.368743e-02 6.666918e-02
##  [51] 5.795840e-02 4.840970e-02 3.884155e-02 2.992887e-02 2.213868e-02
##  [56] 1.571359e-02 1.069571e-02 6.976794e-03 4.357780e-03 2.603975e-03
##  [61] 1.487007e-03 8.105450e-04 4.211607e-04 2.082928e-04 9.788807e-05
##  [66] 4.363196e-05 1.840766e-05 7.333472e-06 2.751850e-06 9.698523e-07
##  [71] 3.200174e-07 9.851262e-08 2.818023e-08 7.457801e-09 1.816920e-09
##  [76] 4.052234e-10 8.221447e-11 1.506581e-11 2.473389e-12 3.604160e-13
##  [81] 4.611849e-14 5.118247e-15 4.855885e-16 3.872130e-17 2.543561e-18
##  [86] 1.343738e-19 5.545729e-21 1.725610e-22 3.873523e-24 5.932821e-26
##  [91] 5.771270e-28 3.244259e-30 9.272894e-33 1.126226e-35 4.468531e-39
##  [96] 3.852849e-43 3.646130e-48 1.052358e-54 5.225588e-64 4.627377e-80
## [101] 0.000000e+00

Beta distribution

  • ranges 0-1

  • can change the shape of the distribution by changing the parameters (a and b), in R called shape1 and shape2

  • a = number of successes you expect a prior

  • b = number of failures

  • the size of the parameters also control trial uncertainty;the higher the numbers, the tighter (i.e., narrower) the curve

  • if we don’t have much prior knowledge, we can set a and b both to 1, which would give us a uniform distribution, and an uniformative prior

  • if we have stronger prior knowledge/a strong belief that \(\theta\) has a particular range of values, we can set a and b to have higher values

x <- 46
n <- 100
## Prior specification:
a <- 210
b <- 21
binom_lh <- function(theta) {
dbinom(x=x, size =n, prob = theta)
}
## normalizing constant:
K <- 1/integrate(f = binom_lh, lower = 0, upper = 1)$value
binom_scaled_lh <- function(theta) K * binom_lh(theta) # compute likelihood, scaling it with normalising constant
library(ggplot2)
## Likelihood
p_beta <- ggplot(data = tibble::tibble(theta = c(0, 1)), aes(theta)) +
  stat_function(
    fun = dbeta,
    args = list(shape1 = a, shape2 = b),
    aes(linetype = "Prior")
  ) +
  ylab("density") +
  stat_function(
    fun = dbeta,
    args = list(shape1 = x + a, shape2 = n - x + b), aes(linetype = "Posterior")
  ) +
  stat_function(
    fun = binom_lh,
    aes(linetype = "Non-scaled likelihood")
  ) +
  stat_function(
    fun = binom_scaled_lh,
    aes(linetype = "Scaled likelihood")
  ) +
  theme_bw() +
  theme(legend.title = element_blank())
p_beta

plot(theta, 
     dbeta(theta,shape1=10,shape2=80), # distribution given priors of 10 succeses & 80 failures
     type="l")

Poisson distribution

  • from Lecture 2.3
  • can be used to model e.g., number of fixations in a region
# simulate 10 random data points with Poisson distribution
# lambda = expected rate of 'successes', e.g., number of regressions into a region
(x<-rpois(n=10,lambda=3))
##  [1] 2 3 2 2 2 2 0 5 5 6
x<-rpois(n=1000,lambda=3)
plot(x,dpois(x,lambda=3),
     ylab="Probability")

Gamma distribution

  • in R, the a and b parameters are called shape and scale
  • mean of gamma distr. = a/b
  • variance of gramma distr. = a/b^2
round(rgamma(n=10, shape=3,scale=1),2)
##  [1] 4.31 3.23 2.65 4.45 3.11 1.18 2.36 1.14 3.30 3.25
# plot gamma distribution given your parameter values
x<-seq(0,10,by=0.01)
plot(x,dgamma(x,shape=3,scale=1),type="l",
     ylab="density")

lambda <- rgamma(4000,shape=20,rate=7) # sample from the posterior distribution
hist(lambda)

Computational Bayesian

library(brms)
fit_press <- brm(rt ~ 1,
data = df_spacebar,
family = gaussian(),
prior = c(
prior(uniform(0, 60000), class = Intercept, lb = 0,
ub = 60000),
prior(uniform(0, 2000), class = sigma, lb = 0,
ub = 2000)
),
chains = 4,
iter = 2000,
warmup = 1000
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
fit_press
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: rt ~ 1 
##    Data: df_spacebar (Number of observations: 361) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept   168.66      1.31   166.08   171.22 1.00     3830     2470
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma    24.99      0.93    23.31    26.83 1.00     3431     2969
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
# fit the model
# The model specification:
brm(rt ~ 1, data = df_spacebar,
# The likelihood assumed:
family = gaussian(),
# The prior specification:
prior = c(
prior(uniform(0, 60000), class = Intercept),
prior(uniform(0, 2000), class = sigma)
),
# Sampling specifications:
chains = 4,
iter = 2000,
warmup = 1000
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: rt ~ 1 
##    Data: df_spacebar (Number of observations: 361) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept   168.64      1.31   166.09   171.25 1.00     3338     2550
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma    25.02      0.96    23.23    26.96 1.00     3775     2495
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Day 3

Can we say that a 95% CI (confidence) contains the true value of \(\mu\) with probability of 95%? i.e., P(lowerCI < \(\mu\) < upperCI) = .95

No: from a frequentist perspective, \(\mu\) is a point value. It would have to be a random variable to talk about the probability of it lying within a range.

Fisher and Pearson were working infields with very tightly controlled designs/high power. In such cases, frequentist is totally fine and p-values can be trusted.

Sampling algorithms

Not from the textbook, somebody in the class requested this (slides 02_sampling_algorithms.pdf/.Rmd)

  • CDF (cumultive density function) has S shape, PDF is Gaussian

  • CDF is like that cause f(x)q is plotted on the y-axis, and for PDF along the x-axis

  • random values sampled from a uniform distribution CDF will like between 0 and 1, and the samples wil have a Gaussian distribution

# sampling from normal distribution
nsim<-10000
samples<-rep(NA,nsim)
for(i in 1:nsim){
u <- runif(1,min=0,max=1)
samples[i]<-qnorm(u)
}
hist(samples,freq=FALSE,
main="Standard Normal")

nsim<-10000
samples<-rep(NA,nsim)
for(i in 1:nsim){
u <- runif(1,min=0,max=1)
samples[i]<-qexp(u)
}
hist(samples,freq=FALSE,main="Exponential")

# sampling from gamma distribution
nsim<-10000
samples<-rep(NA,nsim)
for(i in 1:nsim){
u <- runif(1,min=0,max=1)
samples[i]<-qgamma(u,rate=50,shape=50)
}
hist(samples,freq=FALSE,main="Gamma")

# shape = 'a' paramter (shape1); number of success
# rate = 'b' parameter (shape2); number of failures
# scale = inverse of rate
# play around with vaues of shape and rate to see how it affects the distribution
x = seq(0,1000,by=.01)

plot(x,dgamma(x,
              shape = 100, # no. of successes
              rate = 50), # no. of failures
     type = "l")

Computational Bayes continued

fit_press <- brm(rt ~ 1,
data = df_spacebar,
family = gaussian(),
prior = c(
prior(uniform(0, 60000), class = Intercept, lb = 0,
ub = 60000),
prior(uniform(0, 2000), class = sigma, lb = 0,
ub = 2000)
),
chains = 4,
iter = 2000,
warmup = 1000
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
plot(fit_press)

Plot with shinystan for some diagnostic plots.

# try with shinystan
# install.packages("shinystan")
library(shinystan)
shinystan::launch_shinystan(fit_press)

Extract posteriors from the model and compute summary stats:

as_draws_df(fit_press) %>% head(3)
## # A draws_df: 3 iterations, 1 chains, and 4 variables
##   b_Intercept sigma lprior  lp__
## 1         168    26    -19 -1683
## 2         169    24    -19 -1683
## 3         166    25    -19 -1685
## # ... hidden reserved variables {'.chain', '.iteration', '.draw'}

What happens if we change the prior and make it very tight/informative?

fit_press_inform <- brm(rt ~ 1,
data = df_spacebar,
family = gaussian(),
prior = c(
prior(uniform(0, 50), class = Intercept, lb = 0,
ub = 50),
prior(uniform(0, 20), class = sigma, lb = 0,
ub = 20)
),
chains = 4,
iter = 2000,
warmup = 1000
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
plot(fit_press_inform)

This produces a biased posterior based on unreasonable priors.

Prior Predictive distributions

  1. Take one sample from each of the priors
mu<-runif(1,min=0,max=60000)
sigma<-runif(1, 0, 2000)
  1. Plug those samples into the probability density/mass function used as the likelihood in the model to generate a data set
y_pred_1<-rnorm(n=5,mu,sigma)
y_pred_1
## [1] 9569.189 6692.092 5385.050 1812.395 2792.124
  • each smaple is an imaginary or potential data set

  • there’s code for generating prior predictive daa in R

  • continuum of priors: flat/uninformative, regularising/weakly informative, principled, informative

Lecture 3.4 Sensitiviy analysis

library(brms)
# fit the model with informative priors
fit_press <- brm(rt ~ 1,
data = df_spacebar,
family = gaussian(),
prior = c(
prior(uniform(10,400), class = Intercept, lb = 10,
ub = 400),
prior(uniform(10,100), class = sigma, lb = 10,
ub = 100)
),
chains = 4,
iter = 2000,
warmup = 1000
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
  • Bayesian p-hacking: using an informative prior without running a sensitivy analysis, comparing posteriors fro a range of priors (uniformative, informative/principled, and when relevant a competing informative/principled prior)
as_draws_df(fit_press)$b_Intercept %>% quantile(c(0.025, .975))
##     2.5%    97.5% 
## 166.1124 171.1980

Posterior predictive distributions

Can plot simulated posteriors (blue lines) and observed data (thicker line). Here, we see the prior was far too informative, as the curves don’t have similar range/peak (along x-axis).

pp_check(fit_press, ndraws = 100, type = "dens_overlay")

Lecture 3.6 - Log-normal likelihood

mu <- 6
sigma <- 0.5
N <- 500000
# Generate N random samples from a log-normal distribution
sl <- rlnorm(N, mu, sigma)
  • Shravan has info on BoxCox test in his LM lecture notes (gitHub)

Generate prior predictive distributions

data("df_spacebar")
# create function
normal_predictive_distribution <-
  function(mu_samples, sigma_samples, N_obs) {
    # empty data frame with headers:
    df_pred <- tibble(
      trialn = numeric(0),
      rt_pred = numeric(0),
      iter = numeric(0)
    )
    # i iterates from 1 to the length of mu_samples,
    # which we assume is identical to
    # the length of the sigma_samples:
    for (i in seq_along(mu_samples)) {
      mu <- mu_samples[i]
      sigma <- sigma_samples[i]
      df_pred <- bind_rows(
        df_pred,
        tibble(
          trialn = seq_len(N_obs), # 1, 2,... N_obs
          rt_pred = rnorm(N_obs, mu, sigma),
          iter = i
        )
      )
    }
    df_pred
  }

N_samples <- 1000
N_obs <- nrow(df_spacebar) # n = whatever it is for the dataset
mu_samples <- runif(N_samples, 0, 11) # 1000 samples for mu
sigma_samples <- runif(N_samples, 0, 1) # 1000 samples for sigma
prior_pred_ln <- normal_predictive_distribution( # this function takes vector of samples of mu and sigma and produces the number of observed data points repeated
mu_samples = mu_samples,
sigma_samples = sigma_samples,
N_obs = N_obs
) %>%
mutate(rt_pred = exp(rt_pred)) # exp() cause we had log transformed
  • can also use the brm function and put in a fake dataset (df_spacebar_ref with empty colmobe ‘rt’) and plugging it into our model
    • because brm always assumes some data is put into the model (But we’re useing a dummy dataset), but will take the lognormal() instead of gaussian(), specify priors, and use ‘sample_prior=“only”’, so brm will only produce prior predictive data
df_spacebar_ref <- df_spacebar %>%
mutate(rt = rep(1, n()))
fit_prior_press_ln <- brm(rt ~ 1,
data = df_spacebar_ref,
family = lognormal(),
prior = c(
prior(normal(6, 1.5), class = Intercept),
prior(normal(0, 1), class = sigma)
),
sample_prior = "only",
control = list(adapt_delta = .9)
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
# _ln for log normal
fit_press_ln <- brm(rt ~ 1,
data = df_spacebar,
family = lognormal(),
prior = c(
prior(normal(6, 1.5), class = Intercept),
prior(normal(0, 1), class = sigma)
)
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1

Lecture 3.7 - Computational Bayes

Same model as above:

# only run this if you need to (exact same as above chunk)
# fit_press_ln <- brm(rt ~ 1,
# data = df_spacebar,
# family = lognormal(),
# prior = c(
# prior(normal(6, 1.5), class = Intercept),
# prior(normal(0, 1), class = sigma)
# )
# )

Back transform to raw (from log)

estimate_ms <- exp(as_draws_df(fit_press_ln)$b_Intercept) 
c(mean = mean(estimate_ms), quantile(estimate_ms, probs = c(.025, .975))) # distribution of predicated average reading times, the range over which we can be 95% sure the true value lies
##     mean     2.5%    97.5% 
## 167.0900 164.8565 169.4634
# plot
pp_check(fit_press_ln, ndraws = 100)

pp_check(fit_press, type = "stat", stat = "min")+ ggtitle("Normal model")

pp_check(fit_press_ln, type = "stat", stat = "min") + ggtitle("Log-normal model") # _ln for log normal

Lecture 4.1 - Regression models

Example experiment: dot tracking - RQ: how does attentional load affect pupil size - DV: pupil size (in abstract units)

data("df_pupil_pilot")
df_pupil_pilot$p_size %>% summary()
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   851.5   856.0   862.0   860.8   866.5   868.0
qnorm(c(.025, .975), mean = 1000, sd = 500)
## [1]   20.01801 1979.98199
extraDistr::qtnorm(c(.025,0.975), mean = 0, sd = 1000, a = 0)
## [1]   31.33798 2241.40273

Adding a predictor to your model

  • contrast coding is a form of centring
  • Shravan doesn’t use +/-.5 in dummy coding because the interaction is scaled down (in half), as demonstrated below
data("df_persianE1")
head(df_persianE1)
##     subj item   rt distance   predability
## 60     4    6  568    short   predictable
## 94     4   17  517     long unpredictable
## 146    4   22  675    short   predictable
## 185    4    5  575     long unpredictable
## 215    4    3  581     long   predictable
## 285    4    7 1171     long   predictable

This data is repeated measures 2x2. Arbitrary condition labels: - a = short and predicatble - b = short and unpredictable - c = long predicatbe - d = long unpredictable

# set sum contrasts with +/- 1
df_persianE1$ME_p1 <- ifelse(df_persianE1$predability == "predictble", +1, -1)
df_persianE1$ME_d1 <- ifelse(df_persianE1$distance == "short", +1, -1)

# set sum contrasts with +/- .5
df_persianE1$ME_p5 <- ifelse(df_persianE1$predability == "predictble", +.5, -.5)
df_persianE1$ME_d5 <- ifelse(df_persianE1$distance == "short", +.5, -.5)

# two new columns now
head(df_persianE1)
##     subj item   rt distance   predability ME_p1 ME_d1 ME_p5 ME_d5
## 60     4    6  568    short   predictable    -1     1  -0.5   0.5
## 94     4   17  517     long unpredictable    -1    -1  -0.5  -0.5
## 146    4   22  675    short   predictable    -1     1  -0.5   0.5
## 185    4    5  575     long unpredictable    -1    -1  -0.5  -0.5
## 215    4    3  581     long   predictable    -1    -1  -0.5  -0.5
## 285    4    7 1171     long   predictable    -1    -1  -0.5  -0.5
# get interaction terms
df_persianE1$Int1 <- df_persianE1$ME_p1 * df_persianE1$ME_d1
df_persianE1$Int5 <- df_persianE1$ME_p5 * df_persianE1$ME_d5
head(df_persianE1)
##     subj item   rt distance   predability ME_p1 ME_d1 ME_p5 ME_d5 Int1  Int5
## 60     4    6  568    short   predictable    -1     1  -0.5   0.5   -1 -0.25
## 94     4   17  517     long unpredictable    -1    -1  -0.5  -0.5    1  0.25
## 146    4   22  675    short   predictable    -1     1  -0.5   0.5   -1 -0.25
## 185    4    5  575     long unpredictable    -1    -1  -0.5  -0.5    1  0.25
## 215    4    3  581     long   predictable    -1    -1  -0.5  -0.5    1  0.25
## 285    4    7 1171     long   predictable    -1    -1  -0.5  -0.5    1  0.25

In the output, the interaction of Int1 has the right scale, but the interaction in Int5 is scaled down.

Ch. 3 exercises

Exercise 3.1 A simple linear model.

  1. Fit the model fit_press with just a few iterations, say 50 iterations (set warmup to the default of 25, and use four chains). Does the model converge?
library(brms)
fit_press_1a <- brm(rt ~ 1,
data = df_spacebar,
family = gaussian(),
prior = c(
prior(uniform(0, 60000), class = Intercept, lb = 0,
ub = 60000),
prior(uniform(0, 2000), class = sigma, lb = 0,
ub = 2000)
),
chains = 4,
iter = 50,
warmup = 25
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
plot(fit_press_1a)

pp_check(fit_press_1a, ndraws = 100, type = "dens_overlay")

My answer: yes model coverges.

  1. Using normal distributions, choose priors that better represent your assumptions/beliefs about response times. To think about a reasonable set of priors for \(\mu\) and \(\sigma\), you should come up with your own subjective assessment about what you think a reasonable range of values can be for \(\mu\) and how much variability might happen. There is no correct answer here, we’ll discuss priors in depth in chapter 6.
library(brms)

fit_press_1b <- brm(rt ~ 1,
data = df_spacebar,
family = gaussian(),
prior = c(
prior(normal(100, 4000), class = Intercept, lb = 100,
ub = 4000),
prior(normal(20, 50), class = sigma, lb = 20,
ub = 50)
),
chains = 4,
iter = 50,
warmup = 25
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
plot(fit_press_1b)

pp_check(fit_press_1b, ndraws = 100, type = "dens_overlay")

Exercise 3.2 Revisiting the button-pressing example with different priors.

  1. Can you come up with very informative priors that influence the posterior in a noticeable way (use normal distributions for priors, not uniform priors)? Again, there are no correct answers here; you may have to try several different priors before you can noticeably influence the posterior.
library(brms)
fit_press_2a <- brm(rt ~ 1,
data = df_spacebar,
family = gaussian(),
prior = c(
prior(normal(120, 160), class = Intercept, lb = 120,
ub = 160),
prior(normal(20, 40), class = sigma, lb = 20,
ub = 40)
),
chains = 4,
iter = 50,
warmup = 25
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
plot(fit_press_2a)

pp_check(fit_press_2a, ndraws = 100, type = "dens_overlay")

  1. Generate and plot prior predictive distributions based on this prior and plot them.
fit_press_2b <- brm(rt ~ 1,
                 data = df_spacebar,
                 family = gaussian(),
                 prior = c(
                   prior(normal(120, 160), class = Intercept, lb = 120, ub = 160),
                   prior(normal(20, 40), class = sigma, lb = 20, ub = 40)
                 ),
                 chains = 4,
                 iter = 50,
                 warmup = 25,
                 ## uncomment for prior predictive:
                 sample_prior = "only"
                 ## uncomment when dealing with divergent transitions
                 ## control = list(adapt_delta = .9)
)

## Plot prior predictive distribution of statistical summaries:
pp_check(fit_press_2b, ndraws = 100, type = "stat", stat = "mean",
         prefix = "ppd")
options(sci_pen=0)

N_samples <- 1000
N_obs <- nrow(df_spacebar) # n = whatever it is for the dataset
mu_samples <- runif(N_samples, 120, 160) # 1000 samples for mu
sigma_samples <- runif(N_samples, 20, 40) # 1000 samples for sigma

post_pred_2c <- normal_predictive_distribution( # this function takes vector of samples of mu and sigma and produces the number of observed data points repeated
mu_samples = mu_samples,
sigma_samples = sigma_samples,
N_obs = N_obs
) %>%
mutate(rt_pred = exp(rt_pred)) # exp() cause we had log transformed
  1. Generate posterior predictive distributions based on this prior and plot them.
posterior_predict(fit_press)
plot(fit_press_2b)
# posterior predictive check
pp_check(fit_press_2b)

## Plot poterior predictive distribution of statistical summaries:
pp_check(fit_press_2b, ndraws = 100, type = "stat", stat = "mean",
         prefix = "ppc")

Exercise 3.3 Posterior predictive checks with a log-normal model.

  1. For the log-normal model fit_press_ln, change the prior of \(\sigma\) so that it is a log-normal distribution with location (\(\mu\)) of -2 and scale (\(\sigma\)) of .5. What does such a prior imply about your belief regarding button-pressing times in milliseconds? Is it a good prior? Generate and plot prior predictive distributions. Do the new estimates change compared to earlier models when you fit the model?
location <- -2 # mu
scale <- .5 # sigma

logsd <- exp(location + scale^2)*sqrt(exp(scale^2)-1)

# _ln for log normal
fit_press_ln <- brm(rt ~ 1,
data = df_spacebar,
family = lognormal(),
prior = c(
prior(normal(6, 1.5), class = Intercept),
prior(normal(0, 1), class = sigma)
)
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
  1. For the log-normal model, what is the mean (rather than median) time that takes to press the space bar, what is the standard deviation of the response times in milliseconds?

Exercise 3.4 A skew normal distribution.

Would it make sense to use a “skew normal distribution” instead of the lognormal? The skew normal distribution has three parameters: location \(\xi\) (pronounced xi), scale \(\omega\) (omega), and shape \(\alpha\). The distribution is right skewed if \(\alpha\) > 0, is left skewed if \(\alpha\) < 0, and is identical to the regular normal distribution if \(\alpha\) = 0. For fitting this in brms, one needs to change family and set it to skew_normal(), and add a prior of class = alpha (location remains class = Intercept and scale, class = sigma).

  1. Fit this model with a prior that assigns approximately 95% of the prior probability of alpha to be between 0 and 10.

  2. Generate posterior predictive distributions and compare the posterior distribution of summary statistics of the skew normal with the normal and log-normal.

From dataset from a course participant

## setting priors
# mu intercept+
# sigma = residual error
# sd = spread of predictors
# exp(intercept+sigma*sd)-exp(intercept)
exp(6.5+.5*1)-exp(6.5)
# file = "mod_1" # save your model

Day 4

Lecture 4.1 - Regression models

  • pilot data helps work out priors
library(bcogsci)
df_pupil_pilot$p_size %>% summary()
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   851.5   856.0   862.0   860.8   866.5   868.0

Okay, let’s say our prior is Normal(1000,500):

qnorm(c(.025,.975), mean = 1000, sd = 500)
## [1]   20.01801 1979.98199

Then an uninformative prior truncated at 0 (indicated by a = 0) would be:

extraDistr::qtnorm(c(.025,.975), mean = 0, sd = 1000, a = 0)
## [1]   31.33798 2241.40273
# a = 0 indicates a truncated normal distribution (truncated at the left by zero, meaning no negative values)

This our expected range of standard deviation with the flat prior.

But really we care about Beta. Here we start with an agnostic position, with priors not truncated at 0 meaning it allows for negative values (i.e., that pupil size could increase or decrease with an increase of cognitive load). i.e., like a two-sided t-test.

qnorm(c(.025, .975), mean = 0, sd = 100)
## [1] -195.9964  195.9964
(df_pupil <- df_pupil %>%
mutate(c_load = load - mean(load)))
## # A tibble: 41 × 5
##     subj trial  load p_size c_load
##    <int> <int> <int>  <dbl>  <dbl>
##  1   701     1     2  1021. -0.439
##  2   701     2     1   951. -1.44 
##  3   701     3     5  1064.  2.56 
##  4   701     4     4   913.  1.56 
##  5   701     5     0   603. -2.44 
##  6   701     6     3   826.  0.561
##  7   701     7     0   464. -2.44 
##  8   701     8     4   758.  1.56 
##  9   701     9     2   733. -0.439
## 10   701    10     3   591.  0.561
## # … with 31 more rows
# here we will get the posterior 
fit_pupil <- brm(p_size ~ 1 + c_load,
data = df_pupil,
family = gaussian(),
prior = c(
prior(normal(1000, 500), class = Intercept),
prior(normal(0, 1000), class = sigma),
prior(normal(0, 100), class = b, coef = c_load)
)
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
# if we add 'sample_prior = "only"' the model will ignore the data and only generate prior predictive checks
  • pp_check spits out either posteriior or prior predictive checks depending on what you plug into it (a model with only prior or with posterior)
plot(fit_pupil)

We see in b_c_load the slope: it seems an increase in one unit of cognitive load corresponds to an increase in pupil size by about 30 units, with a spread of about 10 to 60 units. This is a pretty wide spread.

If we want to say that we have evidence that an increase in cognitive load corresponds to an increase in pupil size, we need a model comparison. You cannot argue for evidence without doing a model comparison. Evidence is a likelihood ratio, we should compare two models and the better model has a higher lik.ratio.

H1: \(\beta\) > 0

Given the data and the model that I have, we can have support for the H1 or not.

ROPE method: check if you have a pattern in your data consistent with the predicted pattern or not (based on your priors, either from pilot data, past data, or the literature).

# short_summary is a function written in the source code for the slides; the code is available in the source code for this file
short_summary(fit_pupil)
## ...
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept   701.29     19.93   662.76   740.27 1.00     3671     2597
## c_load       33.81     11.94    10.00    56.64 1.00     3691     2581
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma   128.61     14.85   103.13   161.73 1.00     3350     2776
## 
## ...

You can generate data for different levels of c_load. For c_load= 0 (because load is not centered, this is the baseline condition), so lots of variabilitiy. The average load

for (l in 0:4) { # for each level of the predictor load
  df_sub_pupil <- filter(df_pupil, load == l)
  p <- pp_check(fit_pupil,
    type = "dens_overlay",
    ndraws = 100,
    newdata = df_sub_pupil
  ) +
    geom_point(data = df_sub_pupil, aes(x = p_size, y = 0.0001)) +
    ggtitle(paste("load: ", l)) +
    coord_cartesian(xlim = c(400, 1000))
  print(p)
}

If you have n conditions, you have n-1 comparions.

fit_pupil
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: p_size ~ 1 + c_load 
##    Data: df_pupil (Number of observations: 41) 
##   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup draws = 4000
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept   701.29     19.93   662.76   740.27 1.00     3671     2597
## c_load       33.81     11.94    10.00    56.64 1.00     3691     2581
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma   128.61     14.85   103.13   161.73 1.00     3350     2776
## 
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Lecture 4.3

Looking at the effect of trial on the button-pressing data.

df_spacebar <- df_spacebar %>%
mutate(c_trial = trial - mean(trial))

The priors have to be defined on the log scale.

\(\alpha\) ~ Normal(6,1.5)

df_spacebar_ref <- df_spacebar %>%
mutate(rt = rep(1, n()))
fit_prior_press_trial <- brm(rt ~ 1 + c_trial,
data = df_spacebar_ref,
family = lognormal(),
prior = c(
prior(normal(6, 1.5), class = Intercept),
prior(normal(0, 1), class = sigma),
prior(normal(0, 1), class = b, coef = c_trial)
),
sample_prior = "only", # predictive priors ONLY, will ignore the data
control = list(adapt_delta = .9)
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
median_diff <- function(x) {
median(x - lag(x), na.rm = TRUE)
}
pp_check(fit_prior_press_trial,
type = "stat",
stat = "median_diff",
# show only prior predictive distributions
prefix = "ppd",
# each bin has a width of 500ms
binwidth = 500) +
# cut the top of the plot to improve its scale
coord_cartesian(ylim = c(0, 50))+theme_bw()

What if we change to more informative priors?

\(\beta\) ~ Normal(0,.01)

exp(6) - exp(6 + 0.02)
## [1] -8.149802
# 0.02 because 2 times the sd (0.01) will account for alpha = .05 (the two tails at either end of the distribution)

Prior predictive distribution

# generate prior predicitve distribution
fit_prior_press_trial <- brm(rt ~ 1 + c_trial,
data = df_spacebar_ref,
family = lognormal(),
prior = c(
prior(normal(6, 1.5), class = Intercept),
prior(normal(0, 1), class = sigma),
prior(normal(0, .01), class = b, coef = c_trial)
),
sample_prior = "only", # again, just priors
control = list(adapt_delta = .9)
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1

Plot prior predictive distr.

median_diff <- function(x) {
median(x - lag(x), na.rm = TRUE)
}
pp_check(fit_prior_press_trial,
type = "stat",
stat = "median_diff",
# show only prior predictive distributions
prefix = "ppd")

# each bin has a width of 500ms
# binwidth = 500)

Posterior predictive distribution

# generate posterior
fit_press_trial <- brm(rt ~ 1 + c_trial,
data = df_spacebar,
family = lognormal(),
prior = c(
prior(normal(6, 1.5), class = Intercept),
prior(normal(0, 1), class = sigma),
prior(normal(0, .01), class = b, coef = c_trial)
)
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1

Plot.

pp_check(fit_press_trial)

Lecture 4.5: Logistic regression

  • IV: set size (we’ll treat it as continuous)
  • DV: binary responses (0,1): Bernoulli distr.
head(df_recall)
## # A tibble: 6 × 7
##   subj  set_size correct trial session block tested
##   <chr>    <int>   <int> <int>   <int> <int>  <int>
## 1 10           4       1     1       1     1      2
## 2 10           8       0     4       1     1      8
## 3 10           2       1     9       1     1      2
## 4 10           6       1    23       1     1      2
## 5 10           4       1     5       1     2      3
## 6 10           8       0     7       1     2      5
# Set sizes (we'll treat it as a continuous IV) in the data set:
df_recall$set_size %>%
unique() %>% sort()
## [1] 2 4 6 8
# Trials by set size
df_recall %>%
group_by(set_size) %>%
count()
## # A tibble: 4 × 2
## # Groups:   set_size [4]
##   set_size     n
##      <int> <int>
## 1        2    23
## 2        4    23
## 3        6    23
## 4        8    23

Lecture 4.6: Logistic regression

To determine priors in the log odds space:

qlogis() plogis()

alpha <- rnorm(1000,0,1.5)
hist(alpha)

hist(plogis(alpha))

Lecture 4.8 - Hierarchical regression models

head(df_eeg <- df_eeg %>%
mutate(c_cloze = cloze - mean(cloze)))
## # A tibble: 6 × 7
##    subj cloze  item  n400 cloze_ans     N c_cloze
##   <dbl> <dbl> <dbl> <dbl>     <dbl> <dbl>   <dbl>
## 1     1  0        1  7.08         0    44  -0.476
## 2     1  0.03     2 -0.68         1    44  -0.446
## 3     1  1        3  1.39        44    44   0.524
## 4     1  0.93     4 22.8         41    44   0.454
## 5     1  0        5  1.61         0    44  -0.476
## 6     1  0        6  3.01         0    44  -0.476
# pre-define priors for your model to keep the syntax cleaner
prior_v <-
c(
prior(normal(0, 10), class = Intercept),
prior(normal(0, 10), class = b, coef = c_cloze),
prior(normal(0, 50), class = sigma),
prior(normal(0, 20), class = sd, coef = Intercept, group = subj), # prior for intercept variance by participant
prior(normal(0, 20), class = sd, coef = c_cloze, group = subj)
)
fit_N400_v <- brm(n400 ~ c_cloze + (c_cloze || subj),
prior = prior_v, # priors we defined in the previous chunk
data = df_eeg
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
prior_h <- c(
  prior(normal(0, 10), class = Intercept),
  prior(normal(0, 10), class = b, coef = c_cloze),
  prior(normal(0, 50), class = sigma),
  prior(normal(0, 20),
    class = sd, coef = Intercept,
    group = subj
  ),
  prior(normal(0, 20),
    class = sd, coef = c_cloze,
    group = subj
  ),
  prior(lkj(2), class = cor, group = subj) # lkj prior on the correlation parameter rho
  
)
fit_N400_h <- brm(n400 ~ c_cloze + (c_cloze | subj),
  prior = prior_h,
  data = df_eeg
)
## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppParallel/include/"  -I"/Library/Frameworks/R.framework/Versions/4.2/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DUSE_STANC3 -DSTRICT_R_HEADERS  -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION  -DBOOST_NO_AUTO_PTR  -include '/Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/StanHeaders/include/stan/math/prim/fun/Eigen.hpp:22:
## In file included from /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.2/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1
plot(fit_N400_h)

There’s a new list of priors in this model with the prior(lkj(2),... line. LKJ prior: regularising prior for all the corelations; this is why we don’t run into convergence issues with sparse data with hierarchical models in brm.

In the pot, the correlation parameter cor_subj__intercept_c_cloze is wide spread cause there’s not much data. This is telling us it’s being estimated with great uncertainty. The spread is so large we didn’t learn much. What would happen if we fit a frequentist model?

fit_N400_freq <- lmer(n400 ~ c_cloze + (c_cloze | subj),
  # prior = prior_h,
  data = df_eeg
)
summary(fit_N400_freq)
## Linear mixed model fit by REML ['lmerMod']
## Formula: n400 ~ c_cloze + (c_cloze | subj)
##    Data: df_eeg
## 
## REML criterion at convergence: 22216.3
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -6.0199 -0.6047  0.0075  0.6428  4.3428 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  subj     (Intercept)   4.394   2.096       
##           c_cloze       3.363   1.834   0.30
##  Residual             134.737  11.608       
## Number of obs: 2863, groups:  subj, 37
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   3.6565     0.4073   8.977
## c_cloze       2.3416     0.6150   3.807
## 
## Correlation of Fixed Effects:
##         (Intr)
## c_cloze 0.125

Shrinkage

LMMs allow you to take into account not ony the effect of the individuals but also the entire average behaviour from your sample. Individual differences: one extreme way to investigate ind. diff’s would be to run a separate model for each px, this is called the ‘no pooling’ method and ignores the global average. ‘Complete pooling’ is the other extreme, ignoring the individual differences andfocussing on the global average.

‘Partial pooling’ model is a compromise between ‘no pooling’ and ‘complete pooling’, computing both the global average and the individual-level averageds/differences.

  • shrinking individual estimates towards to the global mean
  • this corrects for px with extreme behaviour
  • if a participant’s data is not extreme, the shrunken estimate will be the same as the pre-shrunken estimate
  • it is skeptical of the indidivual level data

The practical implication of shrinkage:

  • the LMM is unaffected by missing data; even if you lose/delete half of a participant’s data, the LMM estimate will be the same
  • shrinkage occurs automatically when running an LMM, we don’t need to specify anything
  • it is present for both frequentist and Bayesian LMMs

Day 5